Monte Carlo Simulations for Cost Benefit Analysis

A discussion

Zac Payne-Thompson

The Department for Culture, Media and Sport

November 2, 2023

Contents

Pre-requisites

  • Brief History of CBA
  • Outline of the current workflow in the Green Book
  • What do we mean by sensitivities?

Analysis

  • Overview of distributions
  • Monte Carlo analysis
  • Interlude on Optimism Bias
  • Distributional approaches

Evaluation

  • What would this mean in practice?

Reflection

  • Does this matter?

A Brief History of CBA

Historical Development of CBA

The New Deal Era

Addressing the issues of Pareto Efficiency

The Hicks-Kaldor Compensation Principle

Definition

An allocative change increases efficiency if the gainers from the change are capable of compensating the losers and still coming out ahead.

Each individual’s gain or loss is defined as the value of a hypothetical monetary compensation that would keep that individual (in his or her own judgement) indifferent to the change.

Cost-benefit analysis examines whether or not policy changes satisfy the compensation principle.

CBA in Practice

CBA Popularity and Doubts in the 1960s and 1970s

  • In the 1960s, Cost-Benefit Analysis (CBA) gained popularity, even though there was no clear consensus on its theoretical foundation.
  • Government agencies and applied economists embraced CBA during this period.
  • However, by the 1970s, doubts started to emerge regarding the utility of CBA, both theoretically and practically.
  • These doubts were not just theoretical but also related to challenges and criticisms in applying CBA to real-world decision-making.

The Real Practice of CBA

  • While CBA is taught in textbooks with specific methodologies, its practical application in government agencies often differs.
  • Agencies may adapt CBA to their specific needs, using it as a tool to rationalize decisions made for various reasons, including political and administrative considerations.
  • It’s not uncommon for agencies to deviate from standard CBA procedures without always providing a clear rationale.
  • The actual practice of CBA can be influenced by external factors such as legal constraints, data availability, and practical limitations.

CBA in the Green Book

The Appraisal Process

  1. Define the Problem
  2. Establish Objectives
  3. Identify Options
  4. Appraise Options
  5. Sensitivity Analysis
  6. Decision Making
  7. Implementation and Monitoring

Sensitivity Analysis

Definition

  • Sensitivity analysis explores how the expected outcomes of an intervention are sensitive to variations in key input variables.
  • It helps understand the impact of changing assumptions on project feasibility and preferred options.

A key concept is the Switching Value: the value that a key input variable would need to take for the recommended option to switch to another option, or for a proposal not to receive funding.

Identifying switching values is crucial to decision-making.

Variable                              Value
Site area                             39 acres
Existing use land value estimate      £30,659 per acre
Future use land value estimate        £200,000 per acre
Land value uplift per acre            £169,341
Total land value uplift               £6.6m
Wider social benefits                 £1.4m
Present Value Benefits (PVB)          £8m
Present Value Cost (PVC)              £10m
Benefit Cost Ratio (BCR = PVB / PVC)  0.8
Net Present Social Value (NPSV)       -£2m
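
As a rough illustration of a switching value, the sketch below takes the figures from the table and asks what future use land value per acre would be needed for PVB to equal PVC (a BCR of 1). The variable and function names are illustrative, and the sketch assumes every other input is held fixed.

```r
# Inputs taken from the table (values in pounds)
site_area_acres <- 39
existing_value  <- 30659    # existing use land value per acre
future_value    <- 200000   # future use land value per acre
wider_benefits  <- 1.4e6    # wider social benefits
pvc             <- 10e6     # present value cost

# Present value benefits as a function of the future use land value
pvb <- function(future_value_per_acre) {
  site_area_acres * (future_value_per_acre - existing_value) + wider_benefits
}

# Central case: BCR of roughly 0.8
bcr_central <- pvb(future_value) / pvc

# Switching value: the future use land value at which PVB = PVC
switching_value <- existing_value + (pvc - wider_benefits) / site_area_acres
switching_value
```

Under these assumptions the future use land value would need to rise from £200,000 to roughly £251,000 per acre before benefits cover costs.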

Optimism Bias

Definition

Optimism bias is the demonstrated systematic tendency for appraisers to be over-optimistic about key project parameters, including capital costs, operating costs, project duration and benefits delivery.

  • Adjust for optimism bias to provide a realistic assessment of project estimates.

  • Adjustments should align with risk avoidance and mitigation measures, with robust evidence required before reductions.

  • Apply optimism bias adjustments to operating and capital costs. Use confidence intervals for key input variables when typical bias measurements are unavailable.

Monte Carlo Analysis

Note

Monte Carlo analysis is a simulation-based risk modelling technique that produces expected values and confidence intervals. The outputs are the result of many simulations that model the collective impact of a number of uncertainties.

It is useful when there are a number of variables with significant uncertainties, which have known, or reasonably estimated, independent probability distributions.

It requires a well-estimated model of the likely impacts of an intervention and expert professional input from an operational researcher, statistician, econometrician, or other experienced practitioner.

Monte Carlo Simulations for CBA

Data and Setup

project_id low central high
1 64.37888 159.9989 223.8726
2 89.41526 133.2824 296.2359
3 70.44885 148.8613 260.1366
4 94.15087 195.4474 424.4830
5 97.02336 148.2902 471.6297
6 52.27782 189.0350 288.0247
7 76.40527 191.4438 236.4092
8 94.62095 160.8735 684.4839
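
The code on later slides refers to an object called data with columns low, central and high. One way to build it from the table is sketched below, assuming it is a plain data frame (the original object type is not shown).

```r
# Project-level cost estimates, copied from the table above
data <- data.frame(
  project_id = 1:8,
  low     = c(64.37888, 89.41526, 70.44885, 94.15087,
              97.02336, 52.27782, 76.40527, 94.62095),
  central = c(159.9989, 133.2824, 148.8613, 195.4474,
              148.2902, 189.0350, 191.4438, 160.8735),
  high    = c(223.8726, 296.2359, 260.1366, 424.4830,
              471.6297, 288.0247, 236.4092, 684.4839)
)

# Sanity check: every project should satisfy low < central < high
stopifnot(all(data$low < data$central), all(data$central < data$high))
```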

  • Objective: Create functions to generate different cost distributions based on user-specified parameters.

  • Process:

    • Each function generates a sequence of possible project-level costs based on user-defined “high” and “low” values.
    • Depending on the chosen distribution assumption, a probability distribution function is applied to create a vector of probabilities.
    • The sample() function is used to randomly sample cost values from the sequence, with replacement, using the assumed probability distribution.

  • Total Cost Distributions:
    • These functions are applied to the project dataset to calculate total costs.
    • The result is a vector of possible total project costs that can be plotted as a distribution.
    • This approach allows for the exploration of different cost scenarios and provides a basis for risk analysis in project management.
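
The step from project-level draws to a total cost distribution can be sketched as follows. This is a simplified illustration rather than the deck's own code: it draws directly with runif() instead of sampling from a grid of possible costs, it uses only three projects from the table, it treats projects as independent, and draw_uniform is a hypothetical helper name.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Three projects from the table, for brevity
projects <- data.frame(
  low  = c(64.37888, 89.41526, 70.44885),
  high = c(223.8726, 296.2359, 260.1366)
)

n_sims <- 10000

# Hypothetical helper: n_sims uniform draws between low and high
draw_uniform <- function(low, high) runif(n_sims, min = low, max = high)

# One column of simulated costs per project (n_sims rows)
draws <- mapply(draw_uniform, projects$low, projects$high)

# Summing across projects within each simulation gives the total cost
total_cost <- rowSums(draws)

# Summary of the simulated total: mean and a central 95% interval
mean(total_cost)
quantile(total_cost, probs = c(0.025, 0.975))
```

Each row of draws is one simulated state of the world, so the vector total_cost is what would be plotted as the total cost distribution.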

1) Uniform Distribution

Project costs are modelled using a uniform distribution spanning low to high.

uniform_1 <- function(low, high){

  # Set of possible costs, in whole units from 0 up to the sum of all
  # high estimates (relies on the global `data` object)
  sequence <- seq(from = 0, to = sum(data$high), by = 1)

  # Uniform probability density evaluated at each possible cost
  distribution <- dunif(sequence, min = low, max = high)

  # Sampling from possible costs using the assumed distribution function
  sample(x = sequence, size = 10000, replace = TRUE, prob = distribution)

}

2) Normal Distribution (without a central estimate)

Project costs are modelled using a normal distribution with a mean defined as the midpoint between high and low, and a standard deviation that is 1/4 of the distance between high and low.

The low and high estimates then sit two standard deviations either side of the mean, so if costs are truly normally distributed they bound approximately 95% of the possible values of an individual project’s cost.
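
A quick check of the 95% claim, using project 1 from the table and pnorm() directly (a sketch; the variable names are illustrative):

```r
# Project 1 from the table
low  <- 64.37888
high <- 223.8726

mean_x <- (low + high) / 2   # midpoint between low and high
sd_x   <- (high - low) / 4   # a quarter of the range

# Probability mass between low and high under Normal(mean_x, sd_x)
coverage <- pnorm(high, mean_x, sd_x) - pnorm(low, mean_x, sd_x)
coverage
```

The result is about 0.954 for any choice of low and high, because the two estimates always lie exactly two standard deviations from the mean.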

This function looks like:

normal_2 <- function(low, high){

  # Set of possible costs, in whole units from 0 up to the sum of all
  # high estimates (relies on the global `data` object)
  sequence <- seq(from = 0, to = sum(data$high), by = 1)

  # Mean equal to the midpoint between low and high
  mean_x <- (high - low) / 2 + low

  # Standard Deviation equal to 1/4 of the distance between low and high
  sd_x <- (high - low) / 4

  # Normal probability density evaluated at each possible cost
  distribution <- dnorm(sequence, mean = mean_x, sd = sd_x)

  # Sampling from possible costs using the assumed distribution function
  sample(x = sequence, size = 10000, replace = TRUE, prob = distribution)

}

3) Normal Distribution (including a central estimate)

As before, except the mean of the normal distribution is assumed to be the central value.

normal_3 <- function(low, central, high){

  # Set of possible costs, in whole units from 0 up to the sum of all
  # high estimates (relies on the global `data` object)
  sequence <- seq(from = 0, to = sum(data$high), by = 1)

  # Mean equal to the central project cost estimate
  mean_x <- central

  # Standard Deviation equal to 1/4 of the distance between low and high
  sd_x <- (high - low) / 4

  # Normal probability density evaluated at each possible cost
  distribution <- dnorm(sequence, mean = mean_x, sd = sd_x)

  # Sampling from possible costs using the assumed distribution function
  sample(x = sequence, size = 10000, replace = TRUE, prob = distribution)

}
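
One consequence worth checking: when the mean is moved to the central estimate while the standard deviation stays at a quarter of the range, low and high are no longer symmetric about the mean, so the share of the distribution they bound falls below the roughly 95% of the midpoint case. A small sketch for project 1, using pnorm() directly (variable names are illustrative):

```r
# Project 1 from the table
low     <- 64.37888
central <- 159.9989
high    <- 223.8726

sd_x     <- (high - low) / 4
midpoint <- (low + high) / 2

# Mass between low and high when the mean is the midpoint
coverage_midpoint <- pnorm(high, midpoint, sd_x) - pnorm(low, midpoint, sd_x)

# Mass between low and high when the mean is the central estimate
coverage_central <- pnorm(high, central, sd_x) - pnorm(low, central, sd_x)

c(midpoint = coverage_midpoint, central = coverage_central)
```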

4) Log-Normal Distribution

  • Are costs and benefits really normally distributed?

  • By definition, they can only be positive.

  • But the upper limit could be infinite?

    • What is the real benefit of Net Zero, e.g. the existence of the human race?
    • Similarly, what would be the cost of a race of hostile aliens enslaving humanity?
    • In either case - probably a lot!

A solution

The Log-Normal distribution allows for a right skew and long upper tail while using the same input parameters as a normal distribution.

In the context of cost estimation for a project, we can use the Cumulative Distribution Function (CDF) of the Log-Normal distribution to calculate the mu (μ) and sigma (σ) parameters required to achieve a distribution where approximately 95% of estimates fall between the low and high cost estimates.

To achieve this, we need to establish a relationship between our central project cost estimate and the distribution's parameters. However, this approach relies on an assumption about what the central estimate represents.

One potential statistic that relates our three project cost estimates to the distribution parameters is the mode.

Assuming that the central cost estimate represents the most likely outcome, it corresponds to the peak of the probability distribution, making it the mode.

The mode of the Log-Normal distribution is given by the formula:

\[mode = e^{\mu - \sigma^2} = central\]

Solving for mu (μ) gives us:

\[\mu = \log(mode) + \sigma^2 = \log(central) + \sigma^2\]

  • Therefore, we need to find the value of sigma (σ) that results in approximately 95% of our project cost estimates falling between the high cost and low cost estimates.

  • This can be calculated by finding the difference between the Log-Normal CDF evaluated at the high cost estimate and the low cost estimate.

  • For a practical illustration, we can utilize the data from the first project.
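
Written out, with F the Log-Normal CDF and Φ the standard normal CDF, the condition on σ described in the bullets above is as follows (low, high and central stand for the three cost estimates):

```latex
\[
F(high) - F(low)
  = \Phi\!\left(\frac{\log(high) - \mu}{\sigma}\right)
  - \Phi\!\left(\frac{\log(low) - \mu}{\sigma}\right)
  = 0.95,
\qquad \mu = \log(central) + \sigma^2
\]
```

Substituting the expression for μ leaves a single equation in σ, which has no closed-form solution and is therefore solved numerically.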

First, define the objective function to be minimised with respect to sigma:

f <- function(sigma){
  
  # The relationship between mode (central), mu and sigma
  mu <- log(data$central[1]) + sigma^2
  
  # The difference between the CDF at high and CDF at low where 95% 
  # of estimates fall
  abs(plnorm(data$high[1], mu, sigma) - plnorm(data$low[1], mu, sigma) - 0.95)
  
}

# Next using optimize to search the interval from lower to upper for a
# minimum of the function f with respect to the first argument, sigma.
fit <- optimize(f, lower = 0, upper = 1)
fit

# Selecting the minimum from the list returned by optimize; this is the
# optimal sigma
sigma_test <- fit$minimum

# Plugging this back into the formula for mu
mu_test <- log(data$central[1]) + sigma_test^2

# Now using these to simulate a distribution
N <- 10000000
nums <- rlnorm(N, mu_test, sigma_test)

# Now testing how many values lie between Low and High
sum(data$low[1] < nums & nums < data$high[1]) / N
[1] 0.95007
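
The same steps can be wrapped into a reusable function. The sketch below is one possible way to do this and is not the deck's own function for this distribution, which is not shown: fit_lognormal is a hypothetical name, it assumes the central estimate is the mode, and it checks the coverage analytically with plnorm() rather than by simulation.

```r
# Hypothetical helper: fit mu and sigma for one project, assuming
# `central` is the mode and `coverage` of the mass lies between
# `low` and `high`
fit_lognormal <- function(low, central, high, coverage = 0.95) {
  objective <- function(sigma) {
    mu <- log(central) + sigma^2
    abs(plnorm(high, mu, sigma) - plnorm(low, mu, sigma) - coverage)
  }
  # Lower bound slightly above zero, since sigma must be positive
  sigma <- optimize(objective, lower = 1e-6, upper = 1, tol = 1e-8)$minimum
  c(mu = log(central) + sigma^2, sigma = sigma)
}

# Project 1 from the table
params <- fit_lognormal(low = 64.37888, central = 159.9989, high = 223.8726)

# Analytical check of the coverage, with no simulation needed
fitted_coverage <- plnorm(223.8726, params[["mu"]], params[["sigma"]]) -
  plnorm(64.37888, params[["mu"]], params[["sigma"]])
fitted_coverage

# Simulated project costs from the fitted distribution
set.seed(1)
costs <- rlnorm(10000, meanlog = params[["mu"]], sdlog = params[["sigma"]])
```

The search interval for sigma is kept at the (0, 1) range used above; a project with a much wider gap between low and high might need a larger upper bound.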

Comparison

Reflection

Questions for the audience

  • Does this matter?
  • What is the point of CBA?
  • Would this make policy better? And importantly, does it increase value for money (VfM) for the taxpayer?

Thank you